BindDB: An Integrated Database and Webtool Platform for "Reverse-ChIP" Epigenomic Analysis.

Authors

  • Ilana Livyatan
  • Yair Aaronson
  • David Gokhman
  • Ran Ashkenazi
  • Eran Meshorer

Abstract

The high-throughput revolution has brought an unprecedented amount of epigenomic data for embryonic stem cells (ESCs), including genome-wide profiles of chromatin-bound proteins and histone modifications generated by chromatin immunoprecipitation assays (ChIP-seq). As dataset after dataset of ChIP-seq data is added to the pool, the time is ripe to reverse the viewpoint from being factor-oriented to the perspective of genomic locations, in order to offer a comprehensive view of the chromatin characteristics and regulatory elements that govern coherent gene groups or chromosomal regions. Previously, we collected over 50 such genome-wide datasets in mouse ESCs (mESCs) and, using a bioinformatic pipeline that we developed, were able to identify novel regulators of histone genes (Gokhman et al., 2013), demonstrating the power of this approach. We have now built a dynamic database and webtool platform called BindDB that enables in silico “reverse-ChIP” analyses for widespread use within the stem cell community. BindDB includes a significantly expanded epigenomic database, which comprises a collection of over 450 genome-wide datasets in over 40 ESC and iPSC lines from mouse and human, integrated with a webtool that emulates our analysis pipeline. Our database foundation includes the ENCODE (ENCODE Project Consortium, 2012) and Roadmap (Bernstein et al., 2010) databases as well as hundreds of datasets from the GEO repository. It is kept up to date, incorporating new data as it becomes available. In addition, the interactive webtool enables any scientist, computational and non-computational alike, to query any number of genes or regions of interest against the database and receive a comprehensive epigenomic profile in ESCs. Anything from a single gene promoter to a set of thousands of unannotated chromosomal regions can be queried.
For larger queries, qualitative enrichment scores and statistical assessment of significance are performed to focus the user on the factors and histone modifications specifically enriched within the query group. Results are also hierarchically clustered in an effort to reveal not only the correlations between the factors themselves but also the subdivisions of the genes/regions within the query group vis-a-vis epigenomic regulation. The BindDB webtool can be found at http://bind-db.huji.ac.il/bindDB/default_new.php or by link from http://www.meshorerlab.huji.ac.il. The BindDB webtool receives a query of genomic regions or genes from the user as input via an interactive webform and then queries the database to determine which factors in the database have evidence of binding to the queried genes or regions. The single query section allows the user to query one gene, together with the portion of the gene to explore (promoter [proximal/distal] and/or the gene body), or one unannotated location in the genome in the form of “chrN:start-end.” The multi-query section allows a user to upload a file containing either a list of gene symbols (Entrez, RefSeq, or UCSC annotations) or a list of any genome coordinates in BED format. Once the “Get Epigenomic Profile” button is pressed, the query is sent to the database, which returns several outputs: (1) a raw results file in comma-separated values format (.csv, Excel compatible) containing the epigenomic “barcodes” of the query regions, in the form of a binary matrix, where cell (i,j) contains the value 1 if factor j binds queried region i and 0 if not; and (2) a dynamic (searchable, sortable) table of those factors that bind one or more of the queried regions
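The binary “barcode” matrix described in the abstract can be illustrated with a minimal Python sketch. This is not BindDB's implementation: the function names, the closed-interval overlap rule, and the toy peak coordinates below are all assumptions made for illustration, and the factor names are used only as example labels. The sketch parses queries in the “chrN:start-end” form, marks cell (i,j) as 1 when any peak of factor j overlaps queried region i, writes the matrix as CSV, and lists the factors that bind at least one region.

```python
import csv
import io
import re

# Accepts strings such as "chr1:100-200" (format taken from the abstract).
REGION_RE = re.compile(r"^(chr[0-9A-Za-z_]+):(\d+)-(\d+)$")


def parse_region(text):
    """Parse 'chrN:start-end' into a (chrom, start, end) tuple."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a chrN:start-end region: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is greater than end: {text!r}")
    return chrom, start, end


def overlaps(a, b):
    """True if two intervals on the same chromosome share at least one base.

    Assumption: both intervals are treated as closed [start, end].
    """
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def barcode_matrix(regions, peaks_by_factor):
    """Return (factors, matrix); matrix[i][j] is 1 if factor j binds region i."""
    factors = sorted(peaks_by_factor)
    matrix = [
        [int(any(overlaps(region, peak) for peak in peaks_by_factor[f])) for f in factors]
        for region in regions
    ]
    return factors, matrix


def to_csv(labels, factors, matrix):
    """Serialize the barcode matrix as CSV text, one row per queried region."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["region"] + factors)
    for label, row in zip(labels, matrix):
        writer.writerow([label] + row)
    return buffer.getvalue()


def bound_factors(factors, matrix):
    """Factors that bind one or more of the queried regions."""
    return [f for j, f in enumerate(factors) if any(row[j] for row in matrix)]


# Hypothetical toy data: coordinates are invented, not taken from BindDB.
QUERIES = ["chr1:100-200", "chr2:500-900"]
PEAKS = {
    "OCT4": [("chr1", 150, 250)],
    "H3K4me3": [("chr1", 50, 120), ("chr2", 800, 1000)],
    "CTCF": [("chr3", 10, 20)],
}

REGIONS = [parse_region(q) for q in QUERIES]
FACTORS, MATRIX = barcode_matrix(REGIONS, PEAKS)
CSV_TEXT = to_csv(QUERIES, FACTORS, MATRIX)
BOUND = bound_factors(FACTORS, MATRIX)

if __name__ == "__main__":
    print(CSV_TEXT, end="")
    print("Factors binding at least one region:", ", ".join(BOUND))
```

A linear scan over all peaks is enough for a toy example; for thousands of regions against hundreds of datasets, an interval index would be the more realistic choice, but that is beyond what the abstract specifies.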

Journal:
  • Cell Stem Cell

Volume 17, Issue 6

Pages: -

Publication date: 2015